Results 1 - 8 of 8
1.
IEEE Trans Pattern Anal Mach Intell ; 46(6): 4366-4380, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38236683

ABSTRACT

Fine-grained image retrieval mainly focuses on learning salient features from seen subcategories as discriminative embeddings while neglecting the problems behind zero-shot settings. We argue that retrieving fine-grained objects from unseen subcategories may rely on more diverse clues, which are easily suppressed by the salient features learnt from seen subcategories. To address this issue, we propose a novel Content-aware Rectified Activation model, which suppresses the activation on salient regions while preserving their discrimination, and spreads activation to adjacent non-salient regions, thus mining more diverse discriminative features for retrieving unseen subcategories. Specifically, we construct a content-aware rectified prototype (CARP) by perceiving the semantics of salient regions. CARP acts as a channel-wise, non-destructive upper bound on activation and can be selectively used to suppress salient regions and obtain rectified features. Moreover, two regularizations are proposed: 1) a semantic coherency constraint that imposes semantic coherency between CARP and the salient regions, aiming to propagate the discriminative ability of the salient regions to CARP; and 2) a feature-navigated constraint that further guides the model to adaptively balance the discrimination power of the rectified features against the suppression of the salient features. Experimental results on fine-grained and product retrieval benchmarks demonstrate that our method consistently outperforms state-of-the-art methods.
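The core mechanism of a channel-wise, non-destructive activation upper bound can be sketched as a per-channel clamp. This is an illustrative toy, not the authors' implementation; the function name and the prototype values are assumptions:

```python
import numpy as np

def rectified_activation(features, prototype):
    """Clamp each channel of a feature map by a per-channel upper bound.

    features:  (C, H, W) activation map
    prototype: (C,) per-channel upper bound (a CARP-style prototype)

    Activations above the bound are suppressed to it; activations below
    the bound pass through unchanged, so non-salient responses survive.
    """
    return np.minimum(features, prototype[:, None, None])

# One channel, 2x2 spatial map; the prototype caps salient peaks at 1.0.
feats = np.array([[[0.2, 1.5], [0.8, 2.0]]])
proto = np.array([1.0])
out = rectified_activation(feats, proto)
```

The clamp is "non-destructive" in the sense that sub-threshold activations are kept exactly, while only the dominant peaks are flattened.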

3.
IEEE Trans Image Process ; 31: 1134-1148, 2022.
Article in English | MEDLINE | ID: mdl-34932477

ABSTRACT

The success of deep convolutional networks (ConvNets) generally relies on a massive amount of well-labeled data, which is labor-intensive and time-consuming to collect and annotate in many scenarios. To eliminate this limitation, self-supervised learning (SSL) has recently been proposed. Specifically, by solving a pre-designed proxy task, SSL is capable of capturing general-purpose features without requiring human supervision. Existing efforts focus on designing a particular proxy task but ignore the semanticity of samples, which is advantageous to downstream tasks; this results in the inherent limitation that the learned features are specific to the proxy task, namely the proxy-task-specificity of features. In this work, to improve the generalizability of features learned by existing SSL methods, we present a novel self-supervised framework, SSL++, that incorporates the proxy-task-independent semanticity of samples into the representation learning process. Technically, SSL++ leverages the complementarity between the low-level generic features learned by a proxy task and the high-level semantic features newly learned from generated semantic pseudo-labels to mitigate the task-specificity and improve the generalizability of features. Extensive experiments show that SSL++ performs favorably against state-of-the-art approaches on established and recent SSL benchmarks.
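One common way to generate semantic pseudo-labels from already-learned features is to cluster them; a minimal sketch with a toy k-means (the abstract does not specify the clustering method, so this is an assumption for illustration):

```python
import numpy as np

def semantic_pseudo_labels(features, n_clusters, n_iters=10, seed=0):
    """Toy k-means over proxy-task features: cluster index = pseudo-label."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen samples.
    centroids = features[rng.choice(len(features), n_clusters, replace=False)]
    for _ in range(n_iters):
        # Assign each sample to its nearest centroid.
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids as cluster means.
        for k in range(n_clusters):
            if (labels == k).any():
                centroids[k] = features[labels == k].mean(axis=0)
    return labels

# Two well-separated groups of 2-d "features".
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = semantic_pseudo_labels(feats, n_clusters=2)
# Nearby samples share a pseudo-label; the two groups get different labels.
```

The pseudo-labels can then supervise a second, semantic head alongside the proxy task, which is the complementarity the abstract describes.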


Subjects
Supervised Machine Learning, Humans
4.
IEEE Trans Image Process ; 30: 6512-6527, 2021.
Article in English | MEDLINE | ID: mdl-34252026

ABSTRACT

Deep learning (DL) is inherently subject to the requirement of a large amount of well-labeled data, which is expensive and time-consuming to obtain manually. To broaden the reach of DL, leveraging free web data becomes an attractive strategy to alleviate data scarcity. However, directly using the collected web data to train a deep model is ineffective because of the noise mixed into it. To address this problem, we develop a novel bidirectional self-paced learning (BiSPL) framework which reduces the effect of noise by learning from web data in a meaningful order. Technically, the BiSPL framework consists of two essential steps. First, relying on distances defined between web samples and labeled source samples, the web samples with short distances are selected and combined to form a new training set. Second, based on the new training set, both easy and hard samples are initially employed to train deep models for higher stability, and the hard samples are gradually dropped as training progresses to reduce the noise. By iteratively alternating these steps, deep models converge to a better solution. We mainly focus on fine-grained visual classification (FGVC) tasks because their datasets are generally small and therefore face a more severe data scarcity problem. Experiments on six public FGVC tasks demonstrate that our proposed method outperforms state-of-the-art approaches. In particular, BiSPL maintains the highest and most stable performance when the scale of the well-labeled training set decreases dramatically.
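The two steps above can be sketched as simple ranking operations; this is a toy illustration under assumed names and a nearest-source distance, not the paper's actual pipeline:

```python
import numpy as np

def select_web_samples(web_feats, source_feats, keep_ratio=0.5):
    """Step 1 (sketch): keep the web samples closest to the labeled source set."""
    # Distance of each web sample to its nearest labeled source sample.
    dists = np.linalg.norm(
        web_feats[:, None] - source_feats[None], axis=2
    ).min(axis=1)
    k = max(1, int(len(web_feats) * keep_ratio))
    return np.argsort(dists)[:k]

def drop_hard_samples(losses, keep_ratio):
    """Step 2 (sketch): keep low-loss samples, dropping the likely-noisy rest."""
    k = max(1, int(len(losses) * keep_ratio))
    return np.argsort(losses)[:k]

# Two web samples lie near the source sample, two are far (likely noise).
web = np.array([[0.0, 0.0], [10.0, 10.0], [0.2, 0.1], [9.0, 9.0]])
src = np.array([[0.0, 0.0]])
kept = select_web_samples(web, src, keep_ratio=0.5)  # indices of samples 0 and 2
```

Alternating these selections while lowering `keep_ratio` over epochs gives the "meaningful order" the framework relies on.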

5.
Sci Rep ; 11(1): 3408, 2021 Feb 09.
Article in English | MEDLINE | ID: mdl-33564082

ABSTRACT

By combining a synthetic jet with film cooling, the incident cooling flow is specially treated in search of a better film cooling method. Numerical simulations of the synthetic coolant ejection are carried out to analyze the cooling performance in detail under different blowing ratios, hole patterns, Strouhal numbers, and orders of incidence for the two rows of holes. By comparing the flow structures and the cooling effect of the synthetic coolant field with those of the steady coolant field, it is found that, within the scope of these investigations, the best cooling effect is obtained under the incident conditions of an elliptical hole with an aspect ratio of 0.618, a blowing ratio of 2.5, and a Strouhal number of St = 0.22. Owing to its strong controllability, the synthetic coolant can be regulated by adjusting the frequency of blowing and suction, thereby changing the interaction between vortex structures and, in turn, improving the film cooling effect. As a result, synthetic coolant ejection is preferable under certain conditions for achieving better outcomes.
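For reference, the Strouhal number is the dimensionless actuation frequency St = f L / U. The helper below uses this standard definition; the numeric values are hypothetical and are not taken from the study:

```python
def strouhal_number(frequency_hz, length_m, velocity_m_s):
    """St = f * L / U: dimensionless frequency of a periodic (synthetic-jet) flow.

    frequency_hz : actuation (blowing/suction) frequency f
    length_m     : characteristic length L (e.g., hole diameter)
    velocity_m_s : characteristic velocity U (e.g., mainstream velocity)
    """
    return frequency_hz * length_m / velocity_m_s

# Hypothetical example: 220 Hz actuation, 10 mm hole, 10 m/s mainstream.
st = strouhal_number(220.0, 0.01, 10.0)  # dimensionless
```

Adjusting the actuation frequency f at fixed L and U moves St, which is the control knob the abstract describes.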

6.
IEEE Trans Pattern Anal Mach Intell ; 43(9): 2905-2920, 2021 09.
Article in English | MEDLINE | ID: mdl-32866094

ABSTRACT

Neural architecture search (NAS) is inherently subject to a gap between the architectures used during searching and validating. To bridge this gap effectively, we develop Differentiable ArchiTecture Approximation (DATA) with an Ensemble Gumbel-Softmax (EGS) estimator and an Architecture Distribution Constraint (ADC) to automatically approximate architectures during searching and validating in a differentiable manner. Technically, the EGS estimator consists of a group of Gumbel-Softmax estimators, which is capable of converting probability vectors to binary codes and passing gradients in reverse, reducing the estimation bias in a differentiable way. Further, to narrow the distribution gap between sampled architectures and the supernet, the ADC is introduced to reduce the variance of sampling during searching. Benefiting from such modeling, architecture probabilities and network weights in the NAS model can be jointly optimized with standard back-propagation, yielding an end-to-end learning mechanism for searching deep neural architectures in an extended search space. Finally, in the validation process, a high-performance architecture that approaches the one learned during searching is readily built. Extensive experiments on various tasks, including image classification, few-shot learning, unsupervised clustering, semantic segmentation, and language modeling, demonstrate that DATA discovers high-performance architectures while guaranteeing the required efficiency. Code is available at https://github.com/XinbangZhang/DATA-NAS.
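The basic Gumbel-Softmax step underlying each estimator in the ensemble can be sketched as follows (a standard forward-pass illustration in plain NumPy, not the DATA code; the straight-through gradient trick requires an autodiff framework and is omitted):

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, seed=0):
    """Draw one relaxed categorical sample from architecture logits.

    Returns (soft, hard): `soft` is a differentiable probability vector,
    `hard` is the one-hot "binary code" obtained by argmax of `soft`.
    """
    rng = np.random.default_rng(seed)
    # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    scaled = np.exp((logits + gumbel) / tau)
    soft = scaled / scaled.sum()
    hard = np.zeros_like(soft)
    hard[soft.argmax()] = 1.0
    return soft, hard

# Three candidate operations on an edge; logits favor the first.
soft, hard = gumbel_softmax_sample(np.array([2.0, 0.5, 0.1]))
```

Lower `tau` pushes `soft` toward one-hot; in practice the hard sample is used in the forward pass while gradients flow through the soft one.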

7.
IEEE Trans Pattern Anal Mach Intell ; 42(4): 809-823, 2020 Apr.
Article in English | MEDLINE | ID: mdl-30596571

ABSTRACT

Clustering is a crucial but challenging task in pattern analysis and machine learning. Existing methods often ignore the combination of representation learning and clustering. To tackle this problem, we reconsider the clustering task from its definition and develop Deep Self-Evolution Clustering (DSEC) to jointly learn representations and cluster data. For this purpose, the clustering task is recast as a binary pairwise-classification problem that estimates whether pairwise patterns are similar. Specifically, similarities between pairwise patterns are defined by the dot product between indicator features generated by a deep neural network (DNN). To learn informative representations for clustering, clustering constraints are imposed on the indicator features so that specific concepts are represented with specific representations. Since ground-truth similarities are unavailable in clustering, an alternating iterative algorithm called Self-Evolution Clustering Training (SECT) is presented to select similar and dissimilar pairwise patterns and to train the DNN alternately. Consequently, the indicator features tend to be one-hot vectors, and the patterns can be clustered by locating the largest response of the learned indicator features. Extensive experiments provide strong evidence that DSEC consistently outperforms current models on twelve popular image, text, and audio datasets.
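The similarity definition and the final cluster read-out follow directly from the abstract and can be sketched in a few lines (the example indicator features are made up for illustration; in DSEC they come from a trained DNN):

```python
import numpy as np

def pairwise_similarity(indicators):
    """Similarity between patterns = dot product of their indicator features."""
    return indicators @ indicators.T

def cluster_assignment(indicators):
    """Cluster = index of the largest response in each indicator feature."""
    return indicators.argmax(axis=1)

# Near-one-hot indicator features for three patterns (two clusters).
ind = np.array([[1.0, 0.0],
                [1.0, 0.0],
                [0.0, 1.0]])
sims = pairwise_similarity(ind)
# Patterns 0 and 1 are similar (dot product 1); 0 and 2 are not (dot product 0).
```

When the indicators are exactly one-hot, the dot product is 1 for same-cluster pairs and 0 otherwise, which is why the pairwise-classification view recovers a clustering.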

8.
IEEE Trans Pattern Anal Mach Intell ; 42(11): 2874-2886, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31071020

ABSTRACT

Convolutional neural networks (CNNs) provide a dramatically powerful class of models but rely on traditional convolution, which can merely aggregate permutation-ordered and dimension-equal local inputs. As a result, CNNs can only handle signals on Euclidean or grid-like domains (e.g., images), not those on non-Euclidean or graph domains (e.g., traffic networks). To eliminate this limitation, we develop a local-aggregation function, a sharable nonlinear operation, to aggregate permutation-unordered and dimension-unequal local inputs on non-Euclidean domains. In the context of function approximation theory, the local-aggregation function is parameterized with a group of orthonormal polynomials in an effective and efficient manner. By replacing the traditional convolution in CNNs with the parameterized local-aggregation function, Local-Aggregation Graph Networks (LAGNs) are readily established; they can fit nonlinear functions without activation functions and can be conveniently trained with standard back-propagation. Extensive experiments on various datasets strongly demonstrate the effectiveness and efficiency of LAGNs, leading to superior performance on numerous pattern recognition and machine learning tasks, including text categorization, molecular activity detection, taxi flow prediction, and image classification.
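A permutation-invariant local aggregation parameterized by a polynomial basis can be sketched as below. This is a loose illustration of the idea, not the LAGN operator: the Chebyshev basis, the elementwise application, and the plain sum are all assumptions made to keep the toy minimal:

```python
import numpy as np

def local_aggregate(neighbor_feats, coeffs):
    """Apply a polynomial (Chebyshev basis, coefficients `coeffs`) elementwise
    to each neighbor's features, then sum over the unordered neighborhood.

    Summation makes the result independent of neighbor ordering, and the
    neighborhood may contain any number of neighbors (dimension-unequal
    local inputs across nodes are handled naturally).
    """
    transformed = np.polynomial.chebyshev.chebval(neighbor_feats, coeffs)
    return transformed.sum(axis=0)

# Three unordered neighbors with 1-d features; f(x) = x (coeffs pick T_1).
nbrs = np.array([[0.5], [0.2], [0.9]])
out_fwd = local_aggregate(nbrs, [0.0, 1.0])
out_rev = local_aggregate(nbrs[::-1], [0.0, 1.0])
# Reordering the neighbors leaves the aggregated output unchanged.
```

Because the learnable part is the coefficient vector, the same aggregation function is sharable across all neighborhoods regardless of their size.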
